A Novel Approach for Transcription Factor Analysis Using SELEX with High-Throughput Sequencing (TFAST)

Authors

  • Daniel J. Reiss
  • Frederick M. Howard
  • Harry L. T. Mobley
Abstract

BACKGROUND In previous work, we designed a modified aptamer-free SELEX-seq protocol (afSELEX-seq) for the discovery of transcription factor binding sites. Here, we present original software, TFAST, designed to analyze afSELEX-seq data and validated against both our previously generated afSELEX-seq dataset and a model dataset. TFAST has a simple graphical interface (Java), so it can be installed and run without extensive bioinformatics expertise. TFAST completes its analysis within minutes on most personal computers.

METHODOLOGY Once afSELEX-seq data are aligned to a target genome, TFAST identifies peaks and, uniquely, compares peak characteristics between cycles. TFAST generates a hierarchical report of graded peaks, their associated genomic sequences, binding site length predictions, and dummy sequences.

PRINCIPAL FINDINGS Including additional cycles of afSELEX-seq improved TFAST's ability to selectively identify peaks: 7,274, 4,255, and 2,628 peaks were identified in two-, three-, and four-cycle afSELEX-seq, respectively. Inter-round analysis by TFAST identified 457 peaks as the strongest candidates for true binding sites. Separating peaks into TFAST's classes of worst, second-best, and best candidate peaks revealed a trend of increasing significance (e-values 4.5 × 10^-12, 2.9 × 10^-46, and 1.2 × 10^-73) and information content (11.0, 11.9, and 12.5 bits over 15 bp) of the motifs discovered within each respective class. TFAST also predicted a binding site length (28 bp) consistent with experimentally derived, non-computational results for the transcription factor PapX (22 to 29 bp).

CONCLUSIONS/SIGNIFICANCE TFAST offers a novel and intuitive approach for determining the DNA binding sites of proteins subjected to afSELEX-seq. Here, we demonstrate that TFAST, using afSELEX-seq data, rapidly and accurately predicted the sequence length and motif of a putative transcription factor's binding site.
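To make the "bits over 15 bp" figures concrete, the following is a minimal sketch of how the information content of a DNA motif is commonly computed from per-position base counts. It assumes a uniform background (2 bits maximum per position) and applies no small-sample correction. The abstract does not state how the reported values were calculated, so this is an illustration of the general measure and not TFAST's own method. The function name `information_content` and its `pseudocount` parameter are hypothetical.

```python
import math


def information_content(counts, pseudocount=0.0):
    """Total information content (bits) of a DNA motif.

    counts: list of per-position base counts, each [A, C, G, T].
    Assumes a uniform background, so each position contributes
    2 - H(position) bits, where H is the Shannon entropy in bits.
    (Hypothetical helper for illustration only.)
    """
    total_bits = 0.0
    for column in counts:
        adjusted = [c + pseudocount for c in column]
        n = sum(adjusted)
        if n == 0:
            raise ValueError("motif column has no observations")
        entropy = 0.0
        for c in adjusted:
            if c > 0:  # treat 0 * log2(0) as 0
                p = c / n
                entropy -= p * math.log2(p)
        total_bits += 2.0 - entropy
    return total_bits


# A fully conserved position carries 2 bits, a 50/50 position carries 1 bit,
# and a uniform position carries 0 bits.
conserved = [10, 0, 0, 0]
half = [5, 5, 0, 0]
uniform = [5, 5, 5, 5]
example_bits = information_content([conserved, half, uniform])
```

Under these assumptions a 15 bp motif can carry at most 30 bits, so values of 11.0 to 12.5 bits correspond to a motif in which only some positions are strongly conserved.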

Similar resources

TECHNICAL REPORT High-throughput SELEX–SAGE method for quantitative modeling of transcription-factor binding sites

The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription f...

HTPSELEX—a database of high-throughput SELEX libraries for transcription factor binding sites

HTPSELEX is a public database providing access to primary and derived data from high-throughput SELEX experiments aimed at characterizing the binding specificity of transcription factors. The resource is primarily intended to serve computational biologists interested in building models of transcription factor binding sites from large sets of binding sequences. The guiding principle is to make a...

RAPID-SELEX for RNA Aptamers

Aptamers are high-affinity ligands selected from DNA or RNA libraries via SELEX, a repetitive in vitro process of sequential selection and amplification steps. RNA SELEX is more complicated than DNA SELEX because of the additional transcription and reverse transcription steps. Here, we report a new selection scheme, RAPID-SELEX (RNA Aptamer Isolation via Dual-cycles SELEX), that simplifies this...

Predicting transcription factor binding motifs from DNA-binding domains, chromatin accessibility and gene expression data

Transcription factors (TFs) play crucial roles in regulating gene expression through interactions with specific DNA sequences. Recently, the sequence motifs of almost 400 human TFs have been identified using high-throughput SELEX sequencing. However, there remain a large number of TFs (∼800) with no high-throughput-derived binding motifs. Computational methods capable of associating known motifs...

An Improved SELEX-Seq Strategy for Characterizing DNA-Binding Specificity of Transcription Factor: NF-κB as an Example

SELEX-Seq is now the optimal high-throughput technique for characterizing DNA-binding specificities of transcription factors. In this study, we introduced an improved EMSA-based SELEX-Seq strategy with several advantages. The improvements of this strategy included: (1) using a FAM-labeled probe to track protein-DNA complex in polyacrylamide gel for rapidly recovering the protein-bound dsDNA wit...

Journal title:

Volume 7, Issue

Pages -

Publication date: 2012